Conditional probability distribution

Given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X is the probability distribution of Y when X is known to take a particular value. If the conditional distribution of Y given X is a continuous distribution, then its probability density function is known as the conditional density function.

The properties of a conditional distribution, such as its moments, are often referred to by corresponding names, such as the conditional mean and conditional variance.

1 Discrete distributions
2 Continuous distributions
3 Relation to independence
4 Properties
5 See also

Discrete distributions

For discrete random variables, the conditional probability mass function of Y given that X takes the value x can be written, using the definition of conditional probability, as:

$p_Y(y\mid X = x)=P(Y = y \mid X = x) = \frac{P(X=x\ \cap Y=y)}{P(X=x)}.$

Because $P(X=x)$ appears in the denominator, this expression is defined only when $P(X=x) > 0.$
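The definition can be illustrated with a small sketch. The joint pmf below is a made-up example, and the helper names `marginal_x` and `conditional_y_given_x` are hypothetical, introduced only for this illustration. Exact fractions are used so that the results can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y), stored as {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def marginal_x(joint, x):
    """P(X = x): sum the joint pmf over all values of y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(joint, x):
    """Return {y: P(Y = y | X = x)}; only defined when P(X = x) > 0."""
    px = marginal_x(joint, x)
    if px == 0:
        raise ValueError("P(X = x) must be positive")
    return {y: p / px for (xi, y), p in joint.items() if xi == x}

cond = conditional_y_given_x(joint, 0)
print(cond)  # {0: Fraction(1, 4), 1: Fraction(3, 4)}
```

Here P(X = 0) = 1/2, so each joint probability in the row x = 0 is divided by 1/2. Conditioning on a value of X that has probability zero raises an error, mirroring the positivity requirement.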

The relation with the conditional distribution of X given Y is:

$P(Y=y \mid X=x) P(X=x) = P(X=x\ \cap Y=y) = P(X=x \mid Y=y)P(Y=y).$
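Rearranging this relation gives P(X = x | Y = y) in terms of P(Y = y | X = x) and the marginal of X. The following is a minimal sketch under assumed inputs: the tables `p_x` and `p_y_given_x` are invented for illustration, and `posterior_x_given_y` is a hypothetical helper name.

```python
from fractions import Fraction

# Hypothetical inputs: marginal pmf of X and conditional pmf of Y given X.
p_x = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p_y_given_x = {
    0: {0: Fraction(1, 4), 1: Fraction(3, 4)},
    1: {0: Fraction(1, 2), 1: Fraction(1, 2)},
}

def posterior_x_given_y(y):
    """Return {x: P(X = x | Y = y)} by rearranging the product relation."""
    # Joint probabilities P(X = x, Y = y) = P(Y = y | X = x) P(X = x).
    joint_col = {x: p_y_given_x[x][y] * p_x[x] for x in p_x}
    # Marginal P(Y = y), obtained by summing the joint over x.
    p_y = sum(joint_col.values())
    if p_y == 0:
        raise ValueError("P(Y = y) must be positive")
    return {x: pxy / p_y for x, pxy in joint_col.items()}

post = posterior_x_given_y(1)
print(post)  # {0: Fraction(3, 5), 1: Fraction(2, 5)}
```

With these numbers P(Y = 1) = 3/8 + 1/4 = 5/8, so P(X = 0 | Y = 1) = (3/8)/(5/8) = 3/5.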

Continuous distributions

Similarly, for continuous random variables, the conditional probability density function of Y given that X takes the value x can be written as

$f_Y(y \mid X=x) = \frac{f_{X, Y}(x, y)}{f_X(x)},$

where $f_{X,Y}(x, y)$ is the joint density of X and Y, and $f_X(x)$ is the marginal density of X. As in the discrete case, the expression is defined only when $f_X(x)>0$.
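As a worked illustration, take a hypothetical joint density (chosen here only for convenience) $f_{X,Y}(x, y) = x + y$ on the unit square $0 \le x, y \le 1$, and zero elsewhere. The marginal density of X is

$f_X(x) = \int_0^1 (x + y)\,dy = x + \tfrac{1}{2}, \quad 0 \le x \le 1,$

which is positive on the whole interval, so the conditional density is

$f_Y(y \mid X=x) = \frac{x + y}{x + \tfrac{1}{2}}, \quad 0 \le y \le 1.$

For each fixed x this integrates to 1 over y, as a density must:

$\int_0^1 \frac{x + y}{x + \tfrac{1}{2}}\,dy = \frac{x + \tfrac{1}{2}}{x + \tfrac{1}{2}} = 1.$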

The relation with the conditional distribution of X given Y is:

$f_Y(y \mid X=x)f_X(x) = f_{X,Y}(x, y) = f_X(x \mid Y=y)f_Y(y).$
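The factorization can be checked numerically. The sketch below assumes a hypothetical joint density, x + y on the unit square, and uses a simple midpoint rule for the integrals; the function names are illustrative only.

```python
def joint_density(x, y):
    """Hypothetical joint density f_{X,Y}(x, y) = x + y on the unit square."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def integrate(g, a=0.0, b=1.0, n=1000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

x0, y0 = 0.3, 0.8

# Marginal densities at the chosen point, by integrating out the other variable.
fx0 = integrate(lambda y: joint_density(x0, y))
fy0 = integrate(lambda x: joint_density(x, y0))

# Conditional densities from the definition (both marginals are positive here).
f_y_given_x = joint_density(x0, y0) / fx0
f_x_given_y = joint_density(x0, y0) / fy0

# Both products should reproduce the joint density at (x0, y0).
lhs = f_y_given_x * fx0
rhs = f_x_given_y * fy0

# The conditional density of Y given X = x0 should integrate to 1 over y.
total = integrate(lambda y: joint_density(x0, y) / fx0)

print(lhs, rhs, total)
```

The midpoint rule is exact for linear integrands up to floating-point rounding, so for this particular density both products agree with the joint density to within rounding error.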

The concept of the conditional distribution of a continuous random variable is not as intuitive as it might seem: Borel's paradox shows that conditional probability density functions need not be invariant under coordinate transformations.

Relation to independence

Random variables X and Y are independent if and only if the conditional distribution of Y given X is, for every possible value of X, equal to the unconditional distribution of Y. For discrete random variables, this means $P(Y = y \mid X = x) = P(Y = y)$ for all y and all x with $P(X = x) > 0$. For continuous random variables having a joint density, it means $f_Y(y \mid X=x) = f_Y(y)$ for all y and all x with $f_X(x) > 0$.
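For a finite discrete distribution, this criterion can be tested directly. The following sketch assumes the joint pmf is given as a dictionary of exact fractions (floating-point probabilities would need a tolerance instead of equality); `is_independent` and both example tables are hypothetical.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """Check P(Y = y | X = x) == P(Y = y) for all y and all x with P(X = x) > 0."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    p_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(
        joint.get((x, y), 0) / p_x[x] == p_y[y]
        for x, y in product(xs, ys)
        if p_x[x] > 0
    )

# Built as a product of marginals (1/4, 3/4) and (1/3, 2/3): independent.
independent_joint = {
    (0, 0): Fraction(1, 12), (0, 1): Fraction(2, 12),
    (1, 0): Fraction(3, 12), (1, 1): Fraction(6, 12),
}

# Here P(Y = 0 | X = 0) = 1/4 but P(Y = 0) = 3/8: not independent.
dependent_joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

print(is_independent(independent_joint))  # True
print(is_independent(dependent_joint))    # False
```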

Properties

Seen as a function of y for given x, $P(Y = y \mid X = x)$ is a probability mass function, so its sum over all y is 1 (for a conditional probability density, the integral over all y is 1). Seen as a function of x for given y, it is a likelihood function, and its sum over all x need not be 1.
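Both statements can be seen in a small numerical sketch. The joint pmf is a made-up example and `cond` is a hypothetical helper returning P(Y = y | X = x).

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y), stored as {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def cond(y, x):
    """P(Y = y | X = x), computed from the joint pmf."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint[(x, y)] / p_x

# Fixed x = 0, varying y: a probability mass function, so the sum is 1.
sum_over_y = sum(cond(y, 0) for y in (0, 1))

# Fixed y = 1, varying x: a likelihood function, so the sum need not be 1.
sum_over_x = sum(cond(1, x) for x in (0, 1))

print(sum_over_y)  # 1
print(sum_over_x)  # 5/4
```

In this example the likelihood values are 3/4 and 1/2, which add up to 5/4 rather than 1.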
